Abstract: Wind energy plays an important role as a contributing source of energy, both now and in the
future. It has therefore become very important to predict wind speed and direction in wind farms. Effective
wind prediction has always been challenged by the nonlinear and non-stationary characteristics of the wind
stream. This paper presents three new models for day-ahead wind speed forecasting for the Egyptian
North-Western Mediterranean coast. These wind speed models are based on an adaptive neuro-fuzzy
inference system (ANFIS) estimation scheme. The first proposed model predicts wind speed twenty-four
hours ahead based on real data for the same month in seven consecutive years. The second proposed
model predicts twenty-four hours ahead based on only one month of data using a time series prediction
scheme. The third proposed model also uses one month of data to predict twenty-four hours ahead, but the
data are first passed through a discrete Kalman filter (KF) to minimize the noise content that results from
the uncertainties encountered during wind speed measurement. The Kalman-filtered data used by the third
model gave better estimation results than the other two models, and decreased the mean absolute
percentage error by approximately 64% relative to the first model.

Keywords: Kalman filtering, Forecasting, State Estimation, Time series, Adaptive Neuro-Fuzzy
Inference System.



I. Introduction

The exponential increase in global energy demand is leading to rapid depletion of existing fossil fuel
resources [1], [2]. This has led the power industry to explore renewable energy sources such as wind, solar, and
tidal energy. Renewable energy resources have attracted more attention recently owing to their pollution-free
energy generation capabilities. Wind, as a potential source for large-scale electricity generation, has been
receiving much attention recently [1], [3].

Egypt now relies on burning fossil fuels to satisfy about 85% of its electricity demand, which is growing 
at a rate of 8% per year [4]. The Arab countries' fossil fuel supply is expected to dry up within the next 30-50 
years [4]. The National Renewable Energy Authority (NREA) states that Egypt generated 600 MW of power 
from wind in 2010 with a goal to generate 7.2 GW of wind power by 2020, about 12% of its total electricity 
production [4]. 

Due to the unpredictable nature of wind gusts, accurate wind prediction is difficult but much needed.
Therefore, researchers have focused on deriving accurate stochastic models for wind speed, wind direction, and
consequently wind power prediction. These wind models are based on soft computing, using either probabilistic
modeling (random process estimation theories) or approximate reasoning with expert systems
such as neural networks, fuzzy logic, and hybrid systems [4].

In this paper, new time series forecast models for wind-speed prediction are proposed for Egypt's
north-western coast, since Egypt is a very promising country for wind energy generation. All models are based
on real data gathered for that site. The proposed method does not require much data to give a prediction with
respectable accuracy; the inputs are correlated over several years to take seasonal changes into account. The
resultant models predict twenty-four hours ahead, either from the same month of real data in seven
consecutive years or from only one month of data using a time series prediction scheme. The predicted
wind speed is compared with the actual data to validate the obtained models.

II. Wind-Speed Forecasting 

Integration of accurate wind prediction in the management and control regimes of the wind
energy conversion system (WECS) provides a significant tool for optimizing operating costs and improving
reliability [5]. However, due to highly complex interactions and the contribution of various meteorological
parameters, wind forecasting is a very difficult task. Stochastic techniques depend on collecting wind-speed
data for preparing wind atlases, estimating the monthly and annual production of wind sites, and predicting
optimum sites for wind turbines; a Weibull statistical model is used to predict the performance of hybrid wind
systems and their annual production, fuel consumption, and costs [4].

Very short-term forecasting is defined as look-ahead periods from a few minutes up to an hour, while
short-term forecasting, which is proposed in this paper, covers periods from hours up to a few days ahead. This
difference between the two forecasting time periods is important when creating a prediction system. Three
main classes of techniques have been identified for wind forecasting: numeric weather
prediction (NWP) methods, statistical methods, and methods based upon artificial intelligence [6].

Fuzzy sets were introduced to represent and manipulate data and information that possess
non-statistical uncertainty. Fuzzy sets are a generalization of conventional set theory that was introduced as a
new way to represent vagueness in the data. The approach introduces vagueness (with the aim of reducing
complexity) by eliminating the sharp boundary between the members and nonmembers of a class [4]. These
approaches are problem dependent to a large extent, converge slowly, and may even diverge in certain cases.

The Weibull distribution is the most commonly used probability density function to describe and evaluate the
frequency of wind speed at the selected sites [4]. The Weibull distribution can be described by (1) [7]:

$$f(w) = \frac{v}{c}\left(\frac{w}{c}\right)^{v-1} \exp\left[-\left(\frac{w}{c}\right)^{v}\right] \tag{1}$$

where $v$ is a shape parameter, $c$ is a scale parameter, and the independent variable $w$ is the wind speed. If the
shape parameter equals 2, the Weibull distribution is known as the Rayleigh distribution. For the Rayleigh
distribution, the scale factor $c$, given the average wind speed $\bar{w}$, can be found from $v = 2$ and
$c = \frac{2}{\sqrt{\pi}}\,\bar{w}$ [4].



Figure 1. Probability density of the Rayleigh distribution for selected sites 

In Fig. 1, the wind-speed probability density function (pdf) of the Rayleigh distribution is plotted. The
average wind speeds in the figure are 5 m/s, 5.3 m/s, and 5.4 m/s, corresponding to the wind speed in Sidi Barrani,
Mersa Matruh, and El Dabaa, three important candidate regions on Egypt's North-Western coast [7].
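
As an illustrative sketch only (it is not part of the original study), the Rayleigh case of (1) can be evaluated numerically for the three mean speeds quoted above. The function names `rayleigh_pdf` and `trapezoid` and the 0 to 40 m/s speed grid are assumptions made for this example; the check simply confirms that each curve integrates to about one and reproduces the mean speed used to set the scale factor.

```python
import numpy as np

def rayleigh_pdf(w, mean_speed):
    """Weibull pdf (1) with shape v = 2 and scale c = 2 * mean_speed / sqrt(pi)."""
    c = 2.0 * mean_speed / np.sqrt(np.pi)
    return (2.0 / c) * (w / c) * np.exp(-((w / c) ** 2))

def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid depending on a NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Mean speeds quoted in the text; the speed grid is an assumption of this sketch.
sites = {"Sidi Barrani": 5.0, "Mersa Matruh": 5.3, "El Dabaa": 5.4}
w = np.linspace(0.0, 40.0, 4001)

checks = {}
for name, w_bar in sites.items():
    pdf = rayleigh_pdf(w, w_bar)
    area = trapezoid(pdf, w)       # total probability, expected close to 1
    mean = trapezoid(w * pdf, w)   # mean speed, expected close to w_bar
    checks[name] = (area, mean)
    print(f"{name}: area = {area:.4f}, mean = {mean:.3f} m/s")
```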

In this paper, three different wind-speed prediction models are proposed. The differences between these
models are the size of the wind-speed data block required and the scheme by which ANFIS is implemented. All
proposed models are short-term models that forecast wind speed twenty-four hours ahead.

III. Proposed Approach 

The proposed approach in this paper is based on the Kalman filter and ANFIS as a superior soft-computing
technique.

3.1) Kalman Filter (KF)

The Kalman filter was created by Rudolf E. Kalman in 1960, though Peter Swerling developed
a similar algorithm earlier [8]. It was developed as a recursive solution to the discrete-data linear filtering
problem. The Kalman filter is a recursive data processing algorithm that generates an optimal estimate of the
desired quantities given a set of measurements.

Kalman filtering is popular because it gives good results in practice owing to its optimality and structure.

To use the KF to estimate the internal state of a process given only a sequence of noisy
observations, the following matrices must be specified for the process, which is represented as a linear
stochastic difference equation:

$$x_k = A_k x_{k-1} + B_k u_k + w_k \tag{2}$$
where:

$x_k$ = state vector

$A_k$ = state transition model, which is applied to the previous state $x_{k-1}$

$B_k$ = control-input model, which is applied to the control vector $u_k$

$w_k$ = process noise, which is assumed to be drawn from a zero-mean multivariate normal distribution with
covariance $Q$, $p(w) \sim N(0, Q)$

The relationship between the process state and the measurement values can be represented as [8]:

$$z_k = H_k x_k + v_k \tag{3}$$
where:

$z_k$ = measurement of the system state

$H_k$ = observation (or measurement) model, which maps the true state space into the observed space

$v_k$ = measurement noise, assumed to be zero-mean Gaussian with covariance $R$, $p(v) \sim N(0, R)$

The goal is to find an equation that computes an a posteriori state estimate $\hat{x}_k$ as a linear combination
of an a priori estimate $\hat{x}_k^-$ and a weighted difference between the actual measurement $z_k$ and a
measurement prediction $H\hat{x}_k^-$ [8]:
$$\hat{x}_k = \hat{x}_k^- + K(z_k - H\hat{x}_k^-) \tag{4}$$
The difference $(z_k - H\hat{x}_k^-)$ is called the innovation or residual. A residual of zero means that the two
terms are in complete agreement. The matrix $K$ is the gain or blending factor that minimizes the a posteriori
error covariance. The equation in which this gain is applied is
$$\hat{x}_k = \hat{x}_k^- + K(z_k - H\hat{x}_k^-) \tag{5}$$
The ongoing discrete Kalman filter cycle is as follows.

Project the state ahead:
$$\hat{x}_k^- = A_k \hat{x}_{k-1} + B_k u_k \tag{6}$$
Project the error covariance ahead:
$$P_k^- = A_k P_{k-1} A_k^T + Q \tag{7}$$
Compute the Kalman gain:
$$K_k = P_k^- H^T (H P_k^- H^T + R)^{-1} \tag{8}$$
Update the estimate with measurement $z_k$:
$$\hat{x}_k = \hat{x}_k^- + K_k (z_k - H \hat{x}_k^-) \tag{9}$$
Update the error covariance [8]:
$$P_k = (I - K_k H) P_k^- \tag{10}$$
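
The cycle in (6) to (10) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the filter configuration used in the paper: the control input is dropped, and the scalar random-walk model (`A = H = 1`), the noise covariances `Q` and `R`, the synthetic measurements, and the function name `kalman_filter` are all hypothetical choices made for the example.

```python
import numpy as np

def kalman_filter(z, A, H, Q, R, x0, P0):
    """Run the discrete Kalman filter cycle (6)-(10) without a control input."""
    n = x0.shape[0]
    x, P = x0.copy(), P0.copy()
    estimates = []
    for zk in z:
        # Time update: project the state (6) and the error covariance (7) ahead.
        x_prior = A @ x
        P_prior = A @ P @ A.T + Q
        # Kalman gain (8).
        S = H @ P_prior @ H.T + R
        K = P_prior @ H.T @ np.linalg.inv(S)
        # Measurement update: state estimate (9) and error covariance (10).
        x = x_prior + K @ (zk - H @ x_prior)
        P = (np.eye(n) - K @ H) @ P_prior
        estimates.append(x.copy())
    return np.array(estimates)

# Hypothetical test signal: a constant 5.3 m/s speed observed with unit-variance noise.
rng = np.random.default_rng(0)
true_speed = 5.3
z = true_speed + rng.normal(0.0, 1.0, size=(200, 1))

A = np.eye(1)            # assumed random-walk state model
H = np.eye(1)            # the state is measured directly
Q = np.array([[1e-4]])   # assumed process noise covariance
R = np.array([[1.0]])    # assumed measurement noise covariance

x_hat = kalman_filter(z, A, H, Q, R, x0=z[0].copy(), P0=np.eye(1))
raw_err = np.mean(np.abs(z[:, 0] - true_speed))
kf_err = np.mean(np.abs(x_hat[:, 0] - true_speed))
print(f"mean abs error: raw = {raw_err:.3f}, filtered = {kf_err:.3f}")
```

With a small `Q` relative to `R`, the filter trusts its own prediction more than each new measurement, which is what smooths the noisy series; real wind data would need these two values tuned.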
3.2) Adaptive Neuro-Fuzzy Inference Systems (ANFIS)

The Adaptive Neuro-Fuzzy Inference System (ANFIS) is a fuzzy mapping algorithm that is based on the
Takagi-Sugeno-Kang (TSK) fuzzy inference system. ANFIS is an integration of neural networks and fuzzy logic
and has the potential to capture the benefits of both fields in a single framework. ANFIS utilizes linguistic
information from the fuzzy logic as well as the learning capability of an ANN for automatic fuzzy if-then rule
generation and parameter optimization [9].

A conceptual ANFIS consists of five components: input and output databases, a fuzzy system generator,
a Fuzzy Inference System (FIS), and an adaptive neural network. The Sugeno-type Fuzzy Inference System,
which is the combination of a FIS and an adaptive neural network, was used in this study for wind-speed
modeling. The optimization method used is the hybrid learning algorithm [9].

For a first-order Sugeno model, a common rule set with two fuzzy if-then rules is as follows:

Rule 1: If $x_1$ is $A_1$ and $x_2$ is $B_1$, then
$$f_1 = a_1 x_1 + b_1 x_2 + c_1 \tag{11}$$
Rule 2: If $x_1$ is $A_2$ and $x_2$ is $B_2$, then
$$f_2 = a_2 x_1 + b_2 x_2 + c_2 \tag{12}$$
where $x_1$ and $x_2$ are the crisp inputs to the node, $A_1, B_1, A_2, B_2$ are fuzzy sets, and $a_i, b_i$ and $c_i$
($i = 1, 2$) are the coefficients of the first-order polynomial linear functions.

It is possible to assign a different weight to each rule based on the structure of the system, where weights
$w_1$ and $w_2$ are assigned to rules 1 and 2 respectively and $f$ is the weighted average of the rule outputs.

The ANFIS network is composed of five consecutive layers. Each layer contains several nodes described by
the node function. Let $O_i^j$ denote the output of the $i$-th node in layer $j$ [9], [10].

In layer 1, every node is an adaptive node with node function
$$O_i^1 = \mu_{A_i}(x), \quad i = 1, 2 \tag{13}$$
or
$$O_i^1 = \mu_{B_{i-2}}(y), \quad i = 3, 4 \tag{14}$$
where $x$ (or $y$) is the input to the $i$-th node and $A_i$ (or $B_{i-2}$) is a linguistic label associated with this
node. The membership functions for $A$ and $B$ are usually described by generalized bell functions [11], e.g.
$$\mu_{A_i}(x) = \frac{1}{1 + \left|\frac{x - r_i}{p_i}\right|^{2 q_i}} \tag{15}$$
where $\{p_i, q_i, r_i\}$ is the parameter set. Any continuous and piecewise differentiable functions, such as
triangular-shaped membership functions, are also qualified candidates for node functions in this layer [4].
Parameters in this layer are referred to as premise parameters.

In layer 2, each node W multiplies the incoming signals and sends the product out:
$$O_i^2 = w_i = \mu_{A_i}(x) \times \mu_{B_i}(y), \quad i = 1, 2 \tag{16}$$
Each node output represents the firing strength of a rule.

In layer 3, each node N computes the ratio of the $i$-th rule's firing strength to the sum of all rules' firing
strengths:
$$O_i^3 = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \tag{17}$$
The outputs of this layer are called normalized firing strengths.

In layer 4, each node computes the contribution of the $i$-th rule to the overall output:
$$O_i^4 = \bar{w}_i f_i = \bar{w}_i (a_i x + b_i y + c_i), \quad i = 1, 2 \tag{18}$$
where $\bar{w}_i$ is the output of layer 3 and $\{a_i, b_i, c_i\}$ is the parameter set. Parameters of this layer are
referred to as consequent parameters.

In layer 5, the single node $\Sigma$ computes the final output as the summation of all incoming signals:
$$O_1^5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \tag{19}$$
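
The five layers can be traced with a short forward-pass sketch for the two-rule case. It only evaluates the network for fixed parameters and does not include the hybrid learning step; the premise and consequent values, the test inputs, and the names `gbell` and `anfis_forward` are hypothetical and chosen purely for illustration.

```python
import numpy as np

def gbell(x, p, q, r):
    """Generalized bell membership function, as in (15)."""
    return 1.0 / (1.0 + np.abs((x - r) / p) ** (2.0 * q))

def anfis_forward(x, y, premise_A, premise_B, consequent):
    """Forward pass of a two-input first-order Sugeno ANFIS, one rule per (A_i, B_i) pair."""
    # Layer 1: membership grades of each input, (13) and (14).
    mu_A = np.array([gbell(x, *prm) for prm in premise_A])
    mu_B = np.array([gbell(y, *prm) for prm in premise_B])
    # Layer 2: firing strength of each rule, (16).
    w = mu_A * mu_B
    # Layer 3: normalized firing strengths, (17).
    w_bar = w / w.sum()
    # Layer 4: rule outputs f_i = a_i * x + b_i * y + c_i, weighted in (18).
    a, b, c = consequent.T
    f = a * x + b * y + c
    # Layer 5: overall output, (19).
    return float(np.sum(w_bar * f)), w_bar

# Hypothetical parameters (p, q, r) for each fuzzy set and (a, b, c) for each rule.
premise_A = [(2.0, 2.0, 3.0), (2.0, 2.0, 8.0)]
premise_B = [(2.0, 2.0, 3.0), (2.0, 2.0, 8.0)]
consequent = np.array([[0.5, 0.5, 0.0],
                       [0.4, 0.4, 1.0]])

out, w_bar = anfis_forward(5.0, 6.0, premise_A, premise_B, consequent)
print(f"normalized firing strengths = {w_bar}, output = {out:.4f}")
```

Because the normalized firing strengths sum to one, the output is always a weighted average of the individual rule outputs, which is the sense in which the network reproduces a Sugeno-type inference.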

Thus, an adaptive network is functionally equivalent to a Sugeno-type fuzzy inference system. ANFIS is 
an embedded tool in the MATLAB fuzzy toolbox. This approach is based on using the neural networks training 
capability to adjust the membership functions' (MF) parameters of the proposed fuzzy inference system (FIS). 

The proposed ANFISs utilize a subtractive clustering technique in which Gaussian MFs are used.
Subtractive clustering generates an initial model for ANFIS training. This method partitions
the data into groups called clusters, and generates an FIS with the minimum number of rules required to
distinguish the fuzzy qualities associated with each of the clusters. Subtractive clustering avoids the curse of
dimensionality that affects the grid partitioning method. It is a fast, one-pass algorithm for estimating the
number of clusters and the cluster centers in a set of data, and is especially useful when there is no clear idea
of how many clusters should be assigned for a given set of data.
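
To make the idea concrete, the following is a simplified sketch of subtractive clustering, not the toolbox routine used in the study. It keeps only the core steps (a density-like potential for every point, selection of the highest potential as a center, and subtraction of that center's influence) and replaces the full accept/reject rule with a single stopping threshold. The parameter names and values `ra`, `squash` and `stop_ratio`, and the synthetic two-cluster data, are assumptions of this example.

```python
import numpy as np

def subtractive_clustering(X, ra=0.5, squash=1.25, stop_ratio=0.15):
    """Simplified subtractive clustering; returns the data points chosen as centers."""
    # Scale every column to the unit range so that one radius fits all dimensions.
    lo, hi = X.min(axis=0), X.max(axis=0)
    Z = (X - lo) / np.where(hi > lo, hi - lo, 1.0)
    alpha = 4.0 / ra ** 2                # neighbourhood used to build the potential
    beta = 4.0 / (squash * ra) ** 2      # wider neighbourhood used for the subtraction
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    potential = np.exp(-alpha * d2).sum(axis=1)
    first_peak = potential.max()
    centers = []
    while True:
        i = int(np.argmax(potential))
        peak = potential[i]
        if peak < stop_ratio * first_peak:
            break                        # remaining points are too sparse to form a cluster
        centers.append(X[i])
        # Remove the influence of the new center from every point's potential.
        potential = potential - peak * np.exp(-beta * d2[i])
    return np.array(centers)

# Hypothetical data: two well-separated groups of 50 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(50, 2)),
               rng.normal(10.0, 0.3, size=(50, 2))])
centers = subtractive_clustering(X)
print(centers)
```

Each center found this way would seed one fuzzy rule, which is why the number of rules does not have to be fixed in advance.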

The real data sets used to build the proposed models were obtained from a large online database of
weather recordings that covers almost all countries around the globe [12]. These recordings are real
hourly measurements for the corresponding sites. The study is carried out for the Mersa Matruh site, one
of the candidate sites in Egypt that has sufficient wind resources [7].

IV. Simulation Results And Discussions 

In this paper the study is based on real wind speed data gathered from Egypt's north-western coast
[12]. This location has been selected based on the evaluation done in [7], as it can be considered one of the most
promising locations on the north coast. Each subsection presents a model that forecasts the wind speed for a
certain period of time using a different data block size, giving three different models by the end of this section.

1.1. Model-I: 24-Hrs Ahead Based on Yearly Data Recordings

The first model is based on a single month's wind-speed data in seven consecutive years, e.g. the month
of July in the years 2007, 2008, 2009, 2010, 2011, 2012 and 2013. To train an ANFIS, complete data sets of
inputs along with their corresponding desired output data are needed. Thus, wind-speed data from 2007-2012 are
used as six inputs, with the data set of 2013 used as the corresponding output. Since July has 31 days, only 30
days of data (5040 data points) were used for training, and the whole 31st day is predicted using the obtained
model. The monthly data are selected from the same season to avoid climatic variation between seasons.

$$\text{Data}(k) = [x_1 \;\; x_2 \;\; x_3 \;\; x_4 \;\; x_5 \;\; x_6] \tag{20}$$

The output training data corresponds to the trajectory prediction.
$$\text{Target}(k) = [x_7] \tag{21}$$

where $x_1$ through $x_7$ are the wind-speed data in seven consecutive years, e.g. the month of July in the years
2007, 2008, 2009, 2010, 2011, 2012 and 2013 respectively.

The training input/output data is a structure whose first component is the six-dimensional input Data(k)
as in (20), and whose second component is the output Target(k) as in (21).
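
A minimal sketch of how the arrays in (20) and (21) could be assembled and split is given below. The real hourly July records are replaced by random placeholder values, so the array `july` and its contents are purely hypothetical; only the shapes and the 30-day/1-day split follow the description above.

```python
import numpy as np

HOURS_PER_DAY = 24
N_DAYS = 31     # days in July
N_YEARS = 7     # 2007 through 2013

# Placeholder for the hourly July records: one row per hour, one column per year.
# Random values stand in for the real measurements, which are not reproduced here.
rng = np.random.default_rng(0)
july = rng.rayleigh(scale=4.0, size=(N_DAYS * HOURS_PER_DAY, N_YEARS))

data = july[:, :6]     # Data(k) = [x1 ... x6], years 2007-2012, as in (20)
target = july[:, 6]    # Target(k) = [x7], year 2013, as in (21)

# The first 30 days are used for training; the 31st day is held out for prediction.
n_train = 30 * HOURS_PER_DAY
train_X, train_y = data[:n_train], target[:n_train]
test_X, test_y = data[n_train:], target[n_train:]

print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)
print("training values in total:", train_X.size + train_y.size)
```

Counting inputs and targets together, the 720 training hours across seven years account for the 5040 data points mentioned in the text.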

Fig. 2 presents the wind-speed data sets in m/s for July in 2007 through 2013, from the upper graph down
respectively; these data sets were obtained from [12]. The data are hourly recordings, so for 31 days a total of
744 data points are shown per graph, but only 720 points per graph are used for training and the last 24
hours are to be predicted. Fig. 3 shows the FIS generated by subtractive clustering using the ANFIS toolbox,
which provides a single-output Takagi-Sugeno-Kang (TSK) type with linear MFs for the output.

The upper graph of Fig. 4 shows the root mean square error (RMSE) that resulted during the ANN
training epochs. The resultant ANFIS model is used for testing and validation by predicting 24-Hrs
ahead. The middle graph of Fig. 4 shows the wind-speed forecast in m/s for one complete day ahead at the
end of July. The error over one day between actual and predicted wind speed is shown in the lower graph of
Fig. 4. The mean value of the error (ME) is found to be around 0.2645 m/s with a mean absolute error (MAE) of
1.6319 m/s. Fig. 5 gives a scatter plot of actual versus predicted speed for the month data.

Figure 2. Real wind-speed data for 7 months all in July (2007-2013).

Figure 5. Scatter plot for actual vs predicted month data

1.2. Model-II: 24-Hrs Ahead Based on one month Data 

This model is based on only one month of wind-speed data, e.g. July 2013. In this model, one month of
hourly wind-speed recordings is required. Data are rearranged to create a mapping from four wind-speed
data points, sampled every 24 hours, to a predicted future of 24 hours, as shown in Fig. 6.

Figure 6. Block diagram of ANFIS without Kalman Filter (KF)

$$\text{Data}(k) = [x(k-3P) \;\; x(k-2P) \;\; x(k-P) \;\; x(k)] \tag{22}$$
The output training data corresponds to the trajectory prediction.

$$\text{Target}(k) = x(k+P) \tag{23}$$
where $k$ is the time instant in hours and $P$ is the period to be predicted ($P = 24$ in this case). The training
input/output data is a structure whose first component is the four-dimensional input Data(k) as in (22), and
whose second component is the output Target(k) as in (23). There are 720 input/output data points. These data
points are used for the ANFIS training (they form the training data set), while only part of them are used as
checking data to validate the identified fuzzy model.
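
The rearrangement in (22) and (23) is a standard lagged embedding of a time series, which can be sketched as follows. The helper name `make_lagged_pairs` is hypothetical, and the sketch keeps only those time instants for which all five samples exist, so the number of pairs it returns depends on that boundary choice. A simple ramp is used as input so that the alignment of the lags is easy to verify by eye.

```python
import numpy as np

def make_lagged_pairs(x, P=24, n_lags=4):
    """Build Data(k) = [x(k-3P), x(k-2P), x(k-P), x(k)] and Target(k) = x(k+P)."""
    x = np.asarray(x, dtype=float)
    first = (n_lags - 1) * P      # earliest k for which x(k - 3P) exists
    last = len(x) - P             # exclusive upper bound: x(k + P) must exist
    ks = np.arange(first, last)
    # Columns are ordered from the oldest lag to the current sample.
    data = np.column_stack([x[ks - j * P] for j in range(n_lags - 1, -1, -1)])
    target = x[ks + P]
    return data, target

# Ramp signal standing in for one month (31 days) of hourly wind speeds.
x = np.arange(744, dtype=float)
data, target = make_lagged_pairs(x)
print(data.shape, target.shape)
print(data[0], target[0])
```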



Figure 7. Upper graph is the RMSE result (versus number of epochs) for model-II, middle graph is the actual speed
and ANFIS prediction for one day (24-Hrs) ahead, and the lower graph shows the prediction error

The one-month data points (July 2013) were plotted earlier as the last graph of Fig. 2 to illustrate the data
used for training. These data points are then rearranged as five vectors of shifted wind-speed recordings (each
vector is shifted by 24 hours from the next). Training the ANFIS is done based on the
concept of time-series prediction. The RMSE resulting from the training epochs of the ANN is shown in the upper
graph of Fig. 7. Data are then used to validate the ANFIS by predicting 24-Hrs ahead. The prediction for July 31st
is shown in the middle graph of Fig. 7, while the lower graph illustrates the prediction error, which is quite
similar for model-II and model-I. The mean value of the error is found to be around 0.8545 m/s with an
MAE of 1.1975 m/s. The results obtained by model-II are similar to those of the previous model (model-I),
but model-II has a significant advantage: it needs far fewer data points in the training step (only about 14% of
the data points used by model-I). Thus, model-II is preferred over model-I.
Fig. 8 gives a scatter plot of actual versus predicted speed for the month data of model-II.




Figure 8. Scatter plot for actual vs predicted month data

1.3. Model-III: 24-Hrs Ahead Based on one month Data with Kalman Filter

Statistical uncertainty is the randomness or error that comes from different sources; the five types
of uncertainty that emerge from imprecise knowledge are:

• Process uncertainty: dynamic randomness.

• Modeling uncertainty: wrong specification of the model structure.

• Measurement uncertainty: error in observed quantities.

• Implementation uncertainty: consequence of the variability.

• Estimate uncertainty: arises from any source of uncertainty or a combination of them, and is also called
inexactness or imprecision.

Wind speed measurements obtained from [12] are subject to uncertainties, which appear as
vagueness of the wind speed value due to noise content. The KF is commonly used to filter noisy
data, as it is considered the best linear unbiased estimator (BLUE). In model-III, the data set is first passed
through the KF to minimize the error covariance exhibited by the data set, as shown in Fig. 9.
The estimated (filtered) data are then rearranged to create a mapping from four wind-speed data points,
sampled every twenty-four hours, to a predicted future of twenty-four hours.



Figure 9. Block diagram of ANFIS with Kalman Filter (KF) 

$$E_{Data}(k) = [x(k-3p) \;\; x(k-2p) \;\; x(k-p) \;\; x(k)] \tag{24}$$
where:

$E_{Data}(k)$ = estimated data vector, whose entries are the Kalman-filtered wind speeds

The output training data corresponds to the trajectory prediction.

$$\text{Estimated Target}(k) = x(k+p) \tag{25}$$
where $k$ is the time instant in hours and $p$ is the period to be predicted ($p = 24$ in this case). The training
input/output data is a structure whose first component is the four-dimensional input $E_{Data}(k)$ as in (24), and
whose second component is the output Estimated Target(k) as in (25). There are 720 input/output data points.
These data points are used for the ANFIS training (they form the training data set), while only part of them are
used as checking data to validate the identified fuzzy model.




Figure 10. Upper graph is the RMSE result for model-III, middle graph is the actual and predicted speed 24-Hrs
ahead, and the lower graph shows the prediction error

The RMSE resulting from the training epochs of the ANN is shown in the upper graph of Fig. 10. Data are then
used to validate the ANFIS by predicting 24-Hrs ahead. The prediction for July 31st is shown in the middle graph
of Fig. 10, while the lower graph illustrates the prediction error. The mean value of the error is found to be
around 0.0804 m/s with an MAE of 0.6624 m/s. The results obtained by model-III are better than those of the
previous model (model-II): the MAE of model-III is only about 55% of that of model-II (0.6624 m/s versus
1.1975 m/s), that is, a reduction of roughly 45%, so the prediction is more accurate. Thus, model-III is preferred
over model-II. Fig. 11 gives a scatter plot of actual versus predicted speed for the month data of model-III.



Figure 11. Scatter plot for actual vs predicted month data



Table I shows an accuracy study of the three proposed models, including the forecasting period, which is
twenty-four hours for all models, based on the same month in seven years for the first model and on one month
for the other two models. The comparison covers RMSE, ME, MAE and mean absolute percentage error (MAPE).

Model-III using the KF showed better accuracy based on the error calculation. Model-III is better than
model-I because the MAPE has been improved by 63.17%, as shown in equation (26):

$$\text{Enhancement Ratio} = \left(1 - \frac{\text{MAPE of Model-III}}{\text{MAPE of Model-I}}\right) = \left(1 - \frac{10.31}{28}\right) \approx 63.17\% \tag{26}$$

Model-III is also better than model-II because the MAPE has been improved by 51.45%, as shown in equation
(27):

$$\text{Enhancement Ratio} = \left(1 - \frac{\text{MAPE of Model-III}}{\text{MAPE of Model-II}}\right) = \left(1 - \frac{10.31}{21.24}\right) \approx 51.45\% \tag{27}$$



Table I. Accuracy study for wind-speed forecasting.

                          Model-I     Model-II    Model-III
Forecasting period        24-Hrs      24-Hrs      24-Hrs
Amount of data points     5040        720         720
RMSE (m/s)                1.9782      1.3466      0.7652
Mean Error (m/s)          0.2645      0.8545      0.0804
MAE (m/s)                 1.6319      1.1975      0.6624
MAPE (%)                  28.00       21.24       10.31
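
For reference, the error measures compared in Table I and the enhancement ratio of (26) and (27) can be computed as in the sketch below. The paper does not spell out the definitions, so the usual textbook forms are assumed here, including the sign convention of the mean error (actual minus predicted) and a MAPE that requires nonzero actual speeds; the function names are hypothetical. Only the MAPE values are taken from Table I.

```python
import numpy as np

def forecast_errors(actual, predicted):
    """ME, MAE, RMSE (same units as the data) and MAPE (percent)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted   # assumed sign convention
    return {
        "ME": float(np.mean(err)),
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAPE": float(100.0 * np.mean(np.abs(err / actual))),  # needs actual != 0
    }

def enhancement_ratio(mape_new, mape_ref):
    """Relative MAPE improvement in percent, in the form of (26) and (27)."""
    return 100.0 * (1.0 - mape_new / mape_ref)

# Tiny made-up series, only to exercise the function.
demo = forecast_errors([2.0, 4.0], [1.0, 5.0])

# MAPE values from Table I.
mape = {"Model-I": 28.00, "Model-II": 21.24, "Model-III": 10.31}
gain_vs_I = enhancement_ratio(mape["Model-III"], mape["Model-I"])
gain_vs_II = enhancement_ratio(mape["Model-III"], mape["Model-II"])
print(f"Model-III vs Model-I:  {gain_vs_I:.2f}%")
print(f"Model-III vs Model-II: {gain_vs_II:.2f}%")
```

The computed ratios agree with the reported 63.17% and 51.45% to within rounding in the second decimal place.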



V. Conclusion 

In this paper, three effective time series stochastic wind models for Egypt's north-western coast were
proposed and optimized using ANFIS and the Kalman filter.

Model-I is based on real wind-speed data sets for the month of July in the years 2007 through 2013; the
target was to predict wind speed 24-Hrs ahead. The accuracy (MAE) of model-I is 1.6319 m/s. Model-II and
model-III are both based on one month of data (July 2013) to predict 24-Hrs ahead; they have the advantage of
using only 14% of the data block size while improving the accuracy at the same time. In model-III an initial
Kalman filter stage has been added. The KF stage filters out the noise arising from measurement uncertainty.
Model-III improved the mean absolute percentage error by approximately 63.17% relative to model-I and by
approximately 51.45% relative to model-II.